STATISTICAL INFERENCES IN CROSS-LAGGED PANEL STUDIES


1.  Introduction 


Panel studies are statistical studies in which two or more variables are
observed for two or more subjects at two or more waves (points in time).
In most panel studies the number of variables and the number of waves are
small but the number of observations at any given wave is large.
Cross-lagged panel studies are those studies for which the variables are
continuous and the purpose of the study is to examine the cross-effects,
that is, the impact of one set of variables on another over time. Such
studies have been used in a variety of social and behavioral settings,
including studies which examine such controversial issues as the effect
of IQ on achievement for school children and the effect of expenditures
for police on a city's crime rate (e.g. Crano, Kenny, and Campbell
(1972), Greenberg, Kessler, and Logan (1979), and Eaton (1978)). The
purpose of this paper is to contribute to the statistical methods used
in such studies.

Methods  for  analyzing  cross-lagged  panel  studies  have  been  developed 
over  the  past  20  years.  Early  methods  were  guided  by  the  work  of  Donald 
Campbell  (1963)  and  were  motivated,  to  some  degree,  by  the  seminal  work 
particular gratitude to T.W. Anderson, D.R. Rogosa, S.S. Carroll and
Visiting Scholar, Department of Statistics, Stanford University.

of  Paul  Lazarsfeld  (1948)  on  analyzing  panel  studies  involving  discrete 
data.  The  methods  proposed  by  Campbell  and  developed  by  him  and  his 
colleagues  (e.g.  Cook  and  Campbell  (1979)  and  Kenny  (1979))  focused  on 
the logic of using the correlational structure obtained from panel data
to test for the presence of cross-effects. For the most part
sampling variation was ignored. Later authors argued that the presence
of  cross-effects  was  best  examined  by  using  a  regression  approach  which 
involves  formulating  regression  models  of  the  responses,  interpreting 
the  cross-effects  in  terms  of  regression  parameters,  and  using  standard 
statistical  methods  to  make  inferences  about  the  regression  parameters 
and  thus  about  the  presence  of  cross-effects  (e.g.  Pelz  and  Andrews 
(1964),  Duncan  (1969),  Heise  (1970)  and  Rogosa  (1980)).  Kessler  and 
Greenberg  (1981)  have  provided  an  excellent  review  of  the  development  of 
cross-lagged  panel  methods. 

We contribute to the regression approach to the analysis of cross-lagged
panel studies by examining issues that affect the optimality of the
methods used to estimate the regression parameters and to test
hypotheses about the presence of cross-effects. The first issue is the
simultaneous nature of the regression models, which arises from the fact
that the regression approach can allow correlation among errors
associated with different equations. The second is the assumption made
on the observations of the initial wave, which can be fixed or
observational in nature. Direct application of regression methods to
panel studies often ignores the possibility of stochastic behavior for
the initial observations. The third is the fact that the regression
parameters may not be homogeneous, in the sense that they may change
over waves.

Our approach is to define a statistical model for the panel study and
then to consider the problems of estimation and testing for the
parameters in the model. We borrow heavily from the theory of
multivariate linear models (Anderson (1984), Rao (1973), and Dunteman
(1984)) and from the theory of replicated vector-valued autoregressive
processes (Anderson (1978)).


In  the  next  section  we  introduce  the  model  and  set  the  notation.  In  the 
third  section  we  present  our  results  for  the  two-wave  panel  study  and  in 
the  fourth  section  we  extend  these  results  to  the  multi-wave  panel.  We 
then  apply  our  results  to  a  panel  study  of  the  attitudes  and  perceptions 
of  patients  in  a  health  maintenance  organization  and  conclude  with 
remarks  on  our  current  efforts. 


2.  Specification  of  the  Model 

Let $z_{it}$ be the vector of $k$ variables measured on the $i$th of $n$
independent subjects at wave $t$ ($t = 0, \ldots, T$) of a panel study.
Suppose the variables divide into two sets as indicated by the partition
$z_{it} = (x_{it}', y_{it}')'$, where $x_{it}$ and $y_{it}$ are of
dimensions $p$ and $q$ respectively and $p + q = k$. The emphasis of the
study is to estimate and test the effects of $x_{i,t-1}$ and $y_{i,t-1}$
on $x_{it}$ and $y_{it}$.

The multivariate regression structure is

$$z_{it} = B z_{i,t-1} + e_{it} \qquad (i = 1, \ldots, n;\ t = 1, \ldots, T) \tag{2.1}$$

where the unobserved error vectors $e_{it}$ are independent and identically
distributed random vectors with mean $0$ and covariance matrix $\Sigma$, and
$B$ is a matrix of unknown regression coefficients.

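The vector $e_{it}$ and matrices $B = (b_{jk})$ and $\Sigma = (\sigma_{jk})$
are partitioned conformally with $z_{it}$ to give

$$e_{it} = \begin{pmatrix} e_{it}^{(1)} \\ e_{it}^{(2)} \end{pmatrix}, \qquad
B = \begin{pmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{pmatrix}
\qquad \text{and} \qquad
\Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}.$$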

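In matrix notation the model is summarized as

$$Z_t = Z_{t-1} B' + E_t$$

where $Z_t$ has $i$th row $z_{it}'$ and $E_t$ has $i$th row $e_{it}'$. With
the rows of $E_t$ stacked into a single vector, the covariance matrix of
$E_t$ can be written $I_n \otimes \Sigma$, where $\otimes$ is the Kronecker
product of linear algebra.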


Specification of the model is completed by specifying the behavior of
the observation matrix of the initial wave, $Z_0$. We either assume that
$Z_0$ is fixed or assume that the rows of $Z_0$ are independent normal
random vectors with common mean $\mu_0$ and common covariance matrix
$\Theta_0$. If $Z_0$ is fixed we let
$Q_n = \frac{1}{n} \sum_{i=1}^{n} z_{i0} z_{i0}'$ and assume
$Q = \lim_{n \to \infty} Q_n$ exists and is of full rank.


If $Z_0$ is random then the covariance matrix $\Theta_1$ of $z_{i1}$
satisfies

$$\Theta_1 = B \Theta_0 B' + \Sigma. \tag{2.2}$$


To highlight the multivariate nature of the model we consider the simple
case of $p = q = 1$ and $T = 1$. The model in (2.1) reduces to

$$\begin{aligned}
x_{i1} &= b_{11} x_{i0} + b_{12} y_{i0} + e_{i1}^{(1)} \\
y_{i1} &= b_{21} x_{i0} + b_{22} y_{i0} + e_{i1}^{(2)}.
\end{aligned} \tag{2.3}$$

The equations look like ordinary multiple regression equations with the
added condition that the errors $e_{i1}^{(1)}$ and $e_{i1}^{(2)}$ are
allowed to be correlated.


3.  Two-wave  Panel  Studies 

We consider the problems of parameter estimation and hypothesis testing
in the two-wave panel study. Such consideration is pertinent because
published applications are often two-wave studies (e.g. Crano, Kenny and
Campbell (1972)) and methodological papers have been devoted to the
statistical analysis of such studies (e.g. Duncan (1969)). Furthermore the
results for two-wave studies are somewhat simpler than the results for
multi-wave studies and thus "set the stage" for more difficult results.

We consider the problem of estimating the regression parameter matrix
$B$ and covariance matrix $\Sigma$ and then consider the problem of
assessing the size of the effects and cross-effects. Technical aspects of
the results are discussed briefly in an appendix.


The analysis of the two-wave model begins by defining two estimators

$$\hat{B}' = (Z_0' Z_0)^{-1} Z_0' Z_1
= \Big( \sum_{i=1}^{n} z_{i0} z_{i0}' \Big)^{-1} \sum_{i=1}^{n} z_{i0} z_{i1}',
\qquad
\hat{\Sigma} = \frac{1}{n} \sum_{i=1}^{n} (z_{i1} - \hat{B} z_{i0})(z_{i1} - \hat{B} z_{i0})'. \tag{3.1}$$

The vector $\hat{b}_j$, the $j$th row of $\hat{B}$, contains the regression
coefficients which can be obtained by regressing the $j$th variable at
wave 1, $z_{i1}^{(j)}$, on $z_{i0}^{(1)}, \ldots, z_{i0}^{(k)}$, all $k$
variables at wave 0, using a standard multiple regression routine. The
covariance estimator $\hat{\Sigma}$ cannot be obtained directly from the
multiple regression output but can be computed using $\hat{B}$ and the
numerical calculation indicated in (3.1).
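The calculation in (3.1) can be sketched numerically as follows. This is a minimal illustration, not the routine used in the paper: it assumes NumPy, the function name `two_wave_estimates` is hypothetical, the data are simulated with made-up parameter values, and the covariance estimator is taken with divisor n (the maximum likelihood form).

```python
import numpy as np

def two_wave_estimates(Z0, Z1):
    """Estimate B and Sigma in Z1 = Z0 B' + E1, following (3.1).

    Z0, Z1 : (n, k) arrays whose ith rows hold the wave-0 and wave-1
             observations on subject i.
    """
    n = Z0.shape[0]
    # Solve (Z0'Z0) B' = Z0'Z1 instead of forming the inverse explicitly.
    B_hat = np.linalg.solve(Z0.T @ Z0, Z0.T @ Z1).T
    resid = Z1 - Z0 @ B_hat.T
    Sigma_hat = resid.T @ resid / n  # divisor n (assumed ML form)
    return B_hat, Sigma_hat

# Hypothetical data: k = 2 variables (p = q = 1), n = 2000 subjects.
rng = np.random.default_rng(0)
B_true = np.array([[0.7, 0.2],
                   [0.0, 0.6]])
Sigma_true = np.array([[1.0, 0.5],
                       [0.5, 1.0]])
n = 2000
Z0 = rng.multivariate_normal(np.zeros(2), np.eye(2), size=n)
E1 = rng.multivariate_normal(np.zeros(2), Sigma_true, size=n)
Z1 = Z0 @ B_true.T + E1

B_hat, Sigma_hat = two_wave_estimates(Z0, Z1)

# Row j of B_hat is what an ordinary multiple regression of the jth
# wave-1 variable on all wave-0 variables would return.
row0_ols = np.linalg.lstsq(Z0, Z1[:, 0], rcond=None)[0]

print(np.round(B_hat, 3))
print(np.round(Sigma_hat, 3))
```

The comparison with `np.linalg.lstsq` mirrors the remark that each row of the coefficient matrix can be obtained from a separate multiple regression, while the error covariance needs the extra residual cross-product step.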


3.1 Estimation of the Effects and Covariances

The  use  of  the  estimators  in  (3.1)  can  be  justified,  in  the  case  of 
fixed  initial  observations,  by  applying  results  of  multivariate  linear 
models.  We  begin  with 


Theorem 1: For a two-wave panel study with fixed initial observations
and normal errors the estimators $\hat{B}$ and $\hat{\Sigma}$ are maximum
likelihood, $\hat{b}_j$, the $j$th row of $\hat{B}$, is a minimum variance
unbiased estimator of $b_j$, the $j$th row of $B$, and the vectors
$\sqrt{n}(\hat{b}_j - b_j)$ are normal random vectors with mean $0$ and
covariance matrices

$$\begin{aligned}
\mathrm{Var}[\sqrt{n}(\hat{b}_j - b_j)] &= \sigma_{jj} Q_n^{-1} \\
\mathrm{Cov}[\sqrt{n}(\hat{b}_j - b_j), \sqrt{n}(\hat{b}_k - b_k)] &= \sigma_{jk} Q_n^{-1} \qquad (j \neq k)
\end{aligned} \tag{3.2}$$

where $Q_n = \frac{1}{n} Z_0' Z_0$.

This result follows directly from the theory of multivariate regression
analysis (e.g. Anderson (1984), Rao (1973)). See the appendix for
additional comments.

If the assumption of normal errors is relaxed then Theorem 1 is
replaced by

Theorem 2: For a two-wave panel model with fixed initial observations
the estimator $\hat{b}_j$ ($j = 1, \ldots, k$) is a minimum variance linear
unbiased estimator and $\sqrt{n}(\hat{b}_j - b_j)$ is, for large $n$,
approximately a normal random vector with the mean and variance given in
Theorem 1 with $Q_n = \frac{1}{n} Z_0' Z_0$ replaced by
$Q = \lim_{n \to \infty} Q_n$.

This result, like Theorem 1, follows from applying the theory of
multivariate linear models.

The covariance matrices of $\sqrt{n}(\hat{b}_j - b_j)$ in Theorems 1 and 2
must be calculated with the formulae indicated if a multiple regression
routine is used to estimate the model but are provided as direct output
if a multivariate regression routine, such as the routine in SAS, is used.

To this point we have assumed that the initial observation matrix $Z_0$
is fixed. Often panel studies are completely observational in nature and
it may be more realistic to assume that $Z_0$ is random. For this case
we have

Theorem 3: For a two-wave panel model with normal initial observations
and normal errors, the estimators $\hat{B}$ and $\hat{\Sigma}$ given in
(3.1) are maximum likelihood and the vectors $\sqrt{n}(\hat{b}_j - b_j)$
are, for large $n$, approximately normal random vectors with mean $0$ and
covariance matrices

$$\begin{aligned}
\mathrm{cov}[\sqrt{n}(\hat{b}_j - b_j)] &= \sigma_{jj} \Theta_0^{-1} \\
\mathrm{cov}[\sqrt{n}(\hat{b}_j - b_j), \sqrt{n}(\hat{b}_k - b_k)] &= \sigma_{jk} \Theta_0^{-1}
\end{aligned} \tag{3.3}$$

where $\Theta_0$ is the covariance matrix of the initial observations.
Furthermore, the large-sample distribution of $\sqrt{n}(\hat{b}_j - b_j)$
remains the same if the condition of normal errors is relaxed.

The proof follows from the theory of replicated vector-valued
autoregressive processes (Anderson (1978)) and is discussed in the
appendix.

Although the estimators given in (3.1) are maximum likelihood estimators
for the model with normal initial observations and normal errors, the
model does contain an additional vector $\mu_0$ and an additional
covariance matrix $\Theta_0$, and both must be estimated. They are the
mean vector and covariance matrix of the initial observation vector
$z_{i0}$.

The maximum likelihood estimators are

$$\hat{\mu}_0 = \frac{1}{n} \sum_{i=1}^{n} z_{i0}, \qquad
\hat{\Theta}_0 = \frac{1}{n} \sum_{i=1}^{n} (z_{i0} - \hat{\mu}_0)(z_{i0} - \hat{\mu}_0)'. \tag{3.4}$$

For the model with random initial observations computation of $\hat{B}$
can be done by a multiple regression routine, as was true for the model
with fixed initial observations, but computation of the maximum
likelihood estimators of $\Sigma$, $\mu_0$ and $\Theta_0$ would require
numerical calculation with the formulae in (3.1) and (3.4).

Similarly the covariance matrix of $\sqrt{n}(\hat{b}_j - b_j)$ can be
computed by numerical calculation using the formula in (3.2).

If a multivariate regression routine is used then the estimators of $B$
and $\Sigma$ and the covariance matrix of $\sqrt{n}(\hat{b}_j - b_j)$ are
given as output; the estimators of $\mu_0$ and $\Theta_0$ must be
evaluated by numerical calculation using the formula in (3.4).
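The numerical calculations mentioned above can be sketched as follows. This is an illustrative sketch only, assuming NumPy. The function names are hypothetical, the data are simulated, and the covariance of the scaled coefficient estimates is taken in the plug-in form "error covariance element times the inverse of the average cross-product matrix of the initial observations", which is an assumed reading of (3.2).

```python
import numpy as np

def initial_wave_mle(Z0):
    """Sample mean and covariance (divisor n) of the wave-0 rows, as in (3.4)."""
    n = Z0.shape[0]
    mu0_hat = Z0.mean(axis=0)
    centered = Z0 - mu0_hat
    Theta0_hat = centered.T @ centered / n
    return mu0_hat, Theta0_hat

def coefficient_covariance(Z0, Sigma_hat, j, m):
    """Plug-in covariance of sqrt(n)(b_j_hat - b_j) and sqrt(n)(b_m_hat - b_m).

    Uses sigma_jm * Qn^{-1} with Qn = Z0'Z0 / n (assumed form of (3.2)).
    """
    n = Z0.shape[0]
    Qn = Z0.T @ Z0 / n
    return Sigma_hat[j, m] * np.linalg.inv(Qn)

# Hypothetical inputs: 500 subjects, k = 2, and a stand-in error covariance.
rng = np.random.default_rng(1)
Z0 = rng.normal(size=(500, 2)) + np.array([1.0, -0.5])
Sigma_hat = np.array([[1.0, 0.4],
                      [0.4, 2.0]])

mu0_hat, Theta0_hat = initial_wave_mle(Z0)
V00 = coefficient_covariance(Z0, Sigma_hat, 0, 0)  # variance block for row 0
V01 = coefficient_covariance(Z0, Sigma_hat, 0, 1)  # covariance of rows 0 and 1

print(np.round(mu0_hat, 3))
print(np.round(Theta0_hat, 3))
print(np.round(V00, 3))
```

In practice `Sigma_hat` would come from the residual calculation in (3.1); a fixed matrix is used here only to keep the sketch short.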


Thus far we have assumed that the covariance matrix of the initial wave
is different from the covariance matrix of the first wave.
Alternatively, we could adopt an assumption common in econometrics, that
the process has a long but unobserved history and thus has become
stationary by wave 0. Then $z_{i0}$ and $z_{i1}$ would have identical
covariance matrices, formally $\Theta_0 = \Theta_1$. This identity and the
structure of the model imply that $\Theta_1 = B \Theta_0 B' + \Sigma$; thus
the covariance matrix $\Theta_0$ would satisfy

$$\Theta_0 = B \Theta_0 B' + \Sigma. \tag{3.5}$$

Incorporating expression (3.5) as a constraint in the maximum likelihood
optimization involves maximizing an equation which is highly non-linear
in the elements of $B$. Optimization requires a maximum likelihood
routine capable of handling nonlinear constraints. In our experience
the values of the estimators obtained have been close to the values of
the estimators given by (3.1). In theory, however, these estimators are
not equivalent to the ones given in (3.1), even for large samples, nor
is their behavior easily tractable (see Anderson (1978) for similar
remarks).

In addition to the computational complexity, the stationary model is
unattractive because panel data with only a few (four or fewer) waves are
not likely, in our experience, to be stationary. We prefer to treat
$\Theta_0$ as an independent parameter to be estimated from the data.

3.2 Assessing the Overall Effects and the Cross-effects

Having considered the problem of estimating the regression parameters
and covariance matrices of the two-wave model we turn to the problem
of assessing the strength of the effects. We begin with consideration
of the problem of testing the significance of the overall effect of
the variables at wave 0, $z_{i0}$, on the variables at wave 1, $z_{i1}$.
We then consider the problem of testing the significance of the
cross-effects, or effects of $x_{i0}$ on $y_{i1}$ and of $y_{i0}$ on
$x_{i1}$. We conclude with consideration of the problem of summarizing
the strength of association due to the existence of cross-effects in the
model.

For the model with normal errors the significance of the overall effect
of $z_{i0}$ on $z_{i1}$ is tested by any of several standard procedures
used to test the hypothesis $H_0^{(1)}: B = 0$ in multivariate linear
models. These tests are covered in texts on multivariate statistics
(Anderson (1984), Rao (1973)) and are evaluated by most standard
multivariate regression programs such as the routine in SAS.

Of the various tests (likelihood ratio, Lawley-Hotelling, Pillai, etc.)
we tend to prefer the likelihood ratio test, in part because it is the
most theoretically compatible with maximum likelihood estimation, which
is used to estimate the models.

For many panel studies the overall effects of the variables at wave 0 on
the variables at wave 1 are highly significant owing, in part, to the
high degree of serial correlation found in each individual variable.
Formally, the diagonal elements of $B$, the effects of $z_{i0}^{(j)}$ on
$z_{i1}^{(j)}$, tend to be large (see the example in section 5).

The more tentative and more intriguing issue is the significance of the
effects of each subset of variables at wave 0 on the other subset at
wave 1, the effects of $x_{i0}$ on $y_{i1}$ and of $y_{i0}$ on $x_{i1}$.
These cross-effects are the off-diagonal blocks, $B_{12}$ and $B_{21}$,
of $B$. The hypothesis to be tested, the hypothesis of no cross-effects,
is $H_0^{(2)}: B_{12} = B_{21}' = 0$.

The hypothesis of no cross-effects cannot be tested with a standard
multivariate regression routine because the null hypothesis $H_0^{(2)}$
cannot be expressed in terms of linear contrasts in $B$ (see the appendix
for additional comments).

One approach to testing $H_0^{(2)}$ is to divide the hypothesis into
$B_{12} = 0$ and $B_{21} = 0$ and then apply standard statistical theory
to test each of the component hypotheses. The components can be
represented as standard contrasts in $B$ and thus a multivariate
regression routine yields the test. Then some method is used to combine
results of the two tests into a test of $H_0^{(2)}$. One method (Kessler
and Greenberg (1981)) combines the two tests by ignoring the correlation
between them, and thus implicitly assumes the tests are independent. This
method seems reasonable, but may give an advertised probability of type I
error quite different from the true probability of a false rejection, the
difference arising from the correlation between the errors of the
equations.

We choose not to divide $H_0^{(2)}$ but, instead, to develop the
likelihood ratio test of $H_0^{(2)}$, a test which takes into account the
correlation structure of the errors. We are not able to give a closed
form expression for the test statistic but, instead, evaluate the test
statistic directly from two applications of a standard maximum
likelihood estimation routine. The easiest explanation of the likelihood
ratio test involves presenting the model in the widely used notation
associated with the LISREL routine for maximum likelihood estimation
(Jöreskog (1979)). We will keep notations distinct by using a
superscript dot to indicate vectors and matrices in the LISREL notation.


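Begin with the LISREL model

$$\dot{y} = \dot{\Lambda}_y \dot{\eta} + \dot{\varepsilon} \tag{3.5}$$

$$\dot{x} = \dot{\Lambda}_x \dot{\xi} + \dot{\delta} \tag{3.6}$$

$$\dot{\eta} = \dot{B} \dot{\eta} + \dot{\Gamma} \dot{\xi} + \dot{\zeta} \tag{3.7}$$

where $\dot{y}$ and $\dot{x}$ are observed random vectors; $\dot{\eta}$,
$\dot{\xi}$, $\dot{\varepsilon}$, $\dot{\delta}$ and $\dot{\zeta}$ are
unobserved random variables; $\dot{\Lambda}_y$, $\dot{\Lambda}_x$,
$\dot{B}$ and $\dot{\Gamma}$ are unknown parameter matrices; and

$$\dot{\Theta}_\varepsilon = \mathrm{cov}(\dot{\varepsilon}), \qquad
\dot{\Theta}_\delta = \mathrm{cov}(\dot{\delta}), \qquad
\dot{\Phi} = \mathrm{cov}(\dot{\xi}), \qquad
\dot{\Psi} = \mathrm{cov}(\dot{\zeta}).$$

To represent the two-wave panel model in this notation let
$\dot{y} = \dot{\eta} = z_{i1}$ and $\dot{x} = \dot{\xi} = z_{i0}$ be $k$
dimensional vectors and let $\dot{\Lambda}_x = \dot{\Lambda}_y = I$ and
$\dot{\Theta}_\varepsilon = \dot{\Theta}_\delta = 0$. Equations (3.5) and
(3.6) become trivial identities and (3.7) becomes

$$z_{i1} = \dot{B} z_{i1} + \dot{\Gamma} z_{i0} + \dot{\zeta}. \tag{3.8}$$

Assuming $\dot{B} = 0$, $\dot{\Gamma} = B$, $\dot{\Phi} = \Theta_0$ and
$\dot{\Psi} = \Sigma$ completes the representation of the two-wave model
in LISREL notation.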

Let the model specified above be fit to the data and let $\chi^2$ be the
chi-square statistic produced by LISREL which indicates the
badness-of-fit of the model.

To obtain the likelihood ratio test of $H_0^{(2)}$ a second model is fit
to the data, a model which is the same as the model above except that the
regression parameter matrix $B$ is constrained to be block diagonal as
indicated by the null hypothesis $H_0^{(2)}$. Let $\chi_0^2$ be the
chi-square badness-of-fit statistic for this constrained model.

The likelihood ratio test statistic is the simple difference
$t_1 = \chi_0^2 - \chi^2$ and the test of the hypothesis of no
cross-effects proceeds by treating $t_1$ as a chi-square statistic with
$2pq$ degrees of freedom.

Note that the test presented above requires two maximizations of the
likelihood function, or equivalently, two runs of LISREL. Two
alternatives exist, neither of which requires these two maximizations and
both of which can be performed with a multivariate regression routine.

The first alternative is to approximate the test statistic $t_1$ by
application of Zellner's theory of seemingly unrelated regressions
(Zellner (1962)). This approximation is outlined in the appendix.
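One way to see what the two fits involve is to compute a likelihood ratio statistic directly, without LISREL. The sketch below is an assumption-laden illustration, not the appendix's approximation: it iterates Zellner's generalized least squares step until the error covariance stops changing (which is assumed here to reach the constrained maximum likelihood fit), and it forms the statistic as n times the difference in log determinants of the two residual covariance estimates, conditional on the initial wave. The function names, the simulated data and the closed-form chi-square tail (valid because 2pq is even) are all choices made for this sketch.

```python
import math
import numpy as np

def unrestricted_sigma(Z0, Z1):
    """Residual covariance (divisor n) from regressing Z1 on all of Z0."""
    B_hat = np.linalg.solve(Z0.T @ Z0, Z0.T @ Z1).T
    resid = Z1 - Z0 @ B_hat.T
    return resid.T @ resid / Z0.shape[0]

def restricted_sigma(Z0, Z1, p, max_iter=500, tol=1e-10):
    """Residual covariance when B is block diagonal, via iterated SUR."""
    n, k = Z0.shape
    # Equation j may use only the wave-0 variables from its own block.
    cols = [np.arange(p) if j < p else np.arange(p, k) for j in range(k)]
    X = np.zeros((n * k, sum(len(c) for c in cols)))
    start = 0
    for j, c in enumerate(cols):
        X[j * n:(j + 1) * n, start:start + len(c)] = Z0[:, c]
        start += len(c)
    y = Z1.T.reshape(-1)  # equation-major stacking of the responses
    Sigma = np.eye(k)
    for _ in range(max_iter):
        # Inverse covariance of the stacked errors is Sigma^{-1} (x) I_n.
        W = np.kron(np.linalg.inv(Sigma), np.eye(n))
        beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)
        resid = (y - X @ beta).reshape(k, n).T
        Sigma_new = resid.T @ resid / n
        converged = np.max(np.abs(Sigma_new - Sigma)) < tol
        Sigma = Sigma_new
        if converged:
            break
    return Sigma

def chi2_sf_even(x, df):
    """Upper tail of a chi-square variable with an even number of df."""
    half = x / 2.0
    return math.exp(-half) * sum(half ** i / math.factorial(i)
                                 for i in range(df // 2))

def lr_no_cross_effects(Z0, Z1, p):
    n, k = Z0.shape
    q = k - p
    _, logdet_full = np.linalg.slogdet(unrestricted_sigma(Z0, Z1))
    _, logdet_null = np.linalg.slogdet(restricted_sigma(Z0, Z1, p))
    t1 = max(n * (logdet_null - logdet_full), 0.0)
    df = 2 * p * q
    return t1, df, chi2_sf_even(t1, df)

# Hypothetical data with real cross-effects (p = q = 1, n = 300).
rng = np.random.default_rng(2)
B_true = np.array([[0.6, 0.4],
                   [0.3, 0.5]])
Sigma_true = np.array([[1.0, 0.5],
                       [0.5, 1.0]])
Z0 = rng.multivariate_normal(np.zeros(2), np.eye(2), size=300)
Z1 = Z0 @ B_true.T + rng.multivariate_normal(np.zeros(2), Sigma_true, size=300)

t1, df, p_value = lr_no_cross_effects(Z0, Z1, p=1)
print(round(t1, 2), df, p_value)
```

The Kronecker-product weight matrix is n·k by n·k, so this direct form is only practical for small panels; it is meant to show the structure of the calculation rather than to be efficient.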

The second alternative is to use a test based on the distribution of the
estimators of the cross-effects ignoring the null hypothesis. Such
tests are called Wald-type tests (e.g. Judge et al. (1980)). The test
statistic, $t_2$, is obtained from the asymptotic distribution of
$\sqrt{n}(\hat{b}_j - b_j)$. In particular we let $b^{(vec)}$ be the
$k^2 \times 1$ vector formed from the rows of $B$ and $\hat{b}^{(vec)}$
the vector formed from the rows of $\hat{B}$. Let $U$ be the
$k^2 \times k^2$ covariance matrix of
$\sqrt{n}(\hat{b}^{(vec)} - b^{(vec)})$ pieced together from the
covariance matrices given for $\sqrt{n}(\hat{b}_j - b_j)$ in Theorems 1,
2 or 3. Let $\hat{b}_{12}^{(vec)}$ be the subvector of $\hat{b}^{(vec)}$
containing the elements of $\hat{B}_{12}$ and $\hat{B}_{21}$, let
$b_{12}^{(vec)}$ be the corresponding subvector of $b^{(vec)}$, and let
$U_{12}$ be the $2pq \times 2pq$ submatrix of $U$ containing the
covariance matrix of $\sqrt{n}(\hat{b}_{12}^{(vec)} - b_{12}^{(vec)})$.
Under the hypothesis of no cross-effects $b_{12}^{(vec)} = 0$ and the
test statistic is

$$t_2 = n\, \hat{b}_{12}^{(vec)\prime}\, U_{12}^{-1}\, \hat{b}_{12}^{(vec)}$$

which is, for large samples, a chi-square statistic with $2pq$ degrees of
freedom.

One advantage of the test based on $t_2$ over the test based on $t_1$ is
that if the assumption of normal errors is relaxed the test based on
$t_2$ remains accurate for large samples.
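The construction of the Wald-type statistic can be sketched as follows. This is a minimal illustration under stated assumptions: NumPy is used, the covariance matrix of the stacked coefficient rows is built as the Kronecker product of the estimated error covariance and the inverse of the average cross-product matrix of the initial wave (the plug-in form assumed for Theorems 1 and 2), and the function names and simulated data are hypothetical.

```python
import math
import numpy as np

def chi2_sf_even(x, df):
    """Upper tail of a chi-square variable with an even number of df."""
    half = x / 2.0
    return math.exp(-half) * sum(half ** i / math.factorial(i)
                                 for i in range(df // 2))

def wald_no_cross_effects(Z0, Z1, p):
    """Wald-type statistic for B12 = 0 and B21 = 0 in the two-wave model."""
    n, k = Z0.shape
    q = k - p
    B_hat = np.linalg.solve(Z0.T @ Z0, Z0.T @ Z1).T
    resid = Z1 - Z0 @ B_hat.T
    Sigma_hat = resid.T @ resid / n
    Qn_inv = np.linalg.inv(Z0.T @ Z0 / n)
    # Rows of B_hat stacked end to end; block (j, l) of U is
    # sigma_jl * Qn^{-1}, which is the Kronecker product below.
    U = np.kron(Sigma_hat, Qn_inv)
    b_vec = B_hat.reshape(-1)
    # Element (j, m) of B is a cross-effect when row j and column m
    # belong to different blocks of the partition.
    idx = [j * k + m for j in range(k) for m in range(k)
           if (j < p) != (m < p)]
    b12 = b_vec[idx]
    U12 = U[np.ix_(idx, idx)]
    t2 = float(n * b12 @ np.linalg.solve(U12, b12))
    df = 2 * p * q
    return t2, df, chi2_sf_even(t2, df)

# Hypothetical data with real cross-effects (p = q = 1, n = 300).
rng = np.random.default_rng(3)
B_true = np.array([[0.6, 0.4],
                   [0.3, 0.5]])
Sigma_true = np.array([[1.0, 0.5],
                       [0.5, 1.0]])
Z0 = rng.multivariate_normal(np.zeros(2), np.eye(2), size=300)
Z1 = Z0 @ B_true.T + rng.multivariate_normal(np.zeros(2), Sigma_true, size=300)

t2, df, p_value = wald_no_cross_effects(Z0, Z1, p=1)
print(round(t2, 2), df, p_value)
```

Only one regression fit is needed here, which is the practical appeal of the Wald-type form over fitting the model twice.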

If the assumption of normal errors holds then the tests based on the
likelihood ratio statistic $t_1$ and the Wald-type statistic $t_2$ are
almost identical for large samples. For samples of moderate size the
powers of the tests can be compared by selecting an alternative
hypothesis $H_A: B_{12} = B_A$ and $B_{21} = B_B$, generating multiple
data sets with $B_A$ and $B_B$ as the true values of $B_{12}$ and
$B_{21}$, and then calculating the proportion of data sets for which each
test rejects. Preliminary calculations suggest that the powers of the
tests are very close if n is 50 or larger. Note however that such
simulation gives an indication of the performance of the tests only when
the errors are truly normal. We conjecture that the Wald-type test may
perform slightly better than the likelihood ratio test if the errors are
far from normal. Obviously, a more complete simulation study is needed
(see Evans and Savin (1982) for more on the relationship between these
tests and other tests when used with an econometric model containing
lagged dependent variables; also see Rothenberg (1982) for the argument
that for the multiple regression model the tests have similar power
properties for samples of moderate size). These results do not apply
directly to the cross-lagged panel model, because of the presence of
lagged predictors, but suggest that further study is needed.

We close our analysis of the two-wave model by considering the problem
of summarizing the degree of association in the model due to the
inclusion of the cross-effects.

One  of  the  most  widely  used  statistics  in  interpreting  the  output  of  a 
multiple  regression  analysis  is  the  square  of  the  partial  correlation 
coefficient  which  indicates  the  percentage  of  variation  in  the  dependent 
variable  explained  by  some  variables  controlling  for  others.  This 
statistic  is  particularly  attractive  in  that  it  is  a  proportional 
reduction  in  error  statistic  where  the  sum  of  squares  residual  is  used 
as  the  measure  of  error.  Thus  it  has  a  simple  interpretation  which,  at 
times,  is  valid  across  models  and  variables. 

Following Sobel and Bohrnstedt (1985) a proportional reduction in error
measure somewhat similar to the squared partial correlation coefficient
can be used to summarize the reduction in the chi-square badness-of-fit
statistic when the cross-effects are included in the model. To be
specific we suggest the measure

$$\mathrm{PRE} = (\chi_0^2 - \chi^2)/\chi_0^2$$

which gives some indication of the importance of the cross-effects in the
panel model controlling for the other effects. Note that Sobel and
Bohrnstedt extend this type of measure to one that compares the models
with and without cross-effects to "baseline models." These extended
measures could be used with the two-wave panel model. The reader should
consult their work for details.


4.  Multi-wave  Panel  Studies 

The issues that arise in estimating the parameters and testing the
hypothesis of no cross-effects for a multi-wave study are somewhat
different from those found in the analysis of a two-wave study. First,
the issue of whether to treat the initial wave as fixed or random is
less critical since the regressor variables include lagged endogenous
variables under either assumption. Secondly, the assumption that B is
homogeneous, or constant, over waves is often questionable and is open to
examination.


For a study of $T + 1$ waves ($T = 2, 3, \ldots$) the regression structure
in (2.1) extends to

$$z_{it} = B z_{i,t-1} + e_{it} \qquad (i = 1, \ldots, n;\ t = 1, \ldots, T). \tag{4.1}$$

If we let $Z_t = (z_{1t}, \ldots, z_{nt})'$ and
$E_t = (e_{1t}, \ldots, e_{nt})'$ then the regression structure in (4.1)
can be summarized as

$$Z_t = Z_{t-1} B' + E_t.$$

We assume $\sum_{t=0}^{T-1} Z_t' Z_t$ is of full rank and define two
estimators

$$\hat{B}' = \Big( \sum_{t=0}^{T-1} Z_t' Z_t \Big)^{-1} \sum_{t=1}^{T} Z_{t-1}' Z_t$$

and

$$\hat{\Sigma} = \frac{1}{nT} \sum_{t=1}^{T} (Z_t - Z_{t-1} \hat{B}')' (Z_t - Z_{t-1} \hat{B}'). \tag{4.2}$$


These  estimators  are  pooled  least-squares  estimators  and  are  natural 
extensions  of  the  estimators  given  in  (3.1). 
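The pooled calculation in (4.2) can be sketched as follows. This is an illustrative sketch, assuming NumPy, a hypothetical function name `pooled_estimates`, simulated data, and a covariance estimator with divisor nT. The panel is passed as a list of wave matrices, each with one row per subject.

```python
import numpy as np

def pooled_estimates(Z):
    """Pooled least-squares estimates for Z_t = Z_{t-1} B' + E_t.

    Z : list of T + 1 arrays, each (n, k); Z[t] holds wave t.
    """
    T = len(Z) - 1
    n = Z[0].shape[0]
    lag_cross = sum(Z[t].T @ Z[t] for t in range(T))              # waves 0..T-1
    lead_cross = sum(Z[t - 1].T @ Z[t] for t in range(1, T + 1))  # lag times lead
    B_hat = np.linalg.solve(lag_cross, lead_cross).T
    resid_cross = sum((Z[t] - Z[t - 1] @ B_hat.T).T @ (Z[t] - Z[t - 1] @ B_hat.T)
                      for t in range(1, T + 1))
    Sigma_hat = resid_cross / (n * T)  # divisor nT (assumed ML form)
    return B_hat, Sigma_hat

# Hypothetical three-wave panel (T = 2), k = 2 variables, n = 1000 subjects.
rng = np.random.default_rng(4)
B_true = np.array([[0.6, 0.2],
                   [0.1, 0.5]])
Sigma_true = np.array([[1.0, 0.3],
                       [0.3, 1.0]])
n, T = 1000, 2
Z = [rng.multivariate_normal(np.zeros(2), np.eye(2), size=n)]
for t in range(1, T + 1):
    E_t = rng.multivariate_normal(np.zeros(2), Sigma_true, size=n)
    Z.append(Z[t - 1] @ B_true.T + E_t)

B_hat, Sigma_hat = pooled_estimates(Z)

# Equivalent view: stack all (lagged, current) pairs and run one regression.
lagged = np.vstack(Z[:-1])
current = np.vstack(Z[1:])
B_stacked = np.linalg.lstsq(lagged, current, rcond=None)[0].T

print(np.round(B_hat, 3))
print(np.round(Sigma_hat, 3))
```

The stacked regression at the end shows why these are called pooled estimators: every subject contributes T lagged and current pairs to a single least-squares fit.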


The vector $\hat{b}_j$, the $j$th row of $\hat{B}$, can be obtained by
regressing all $Tn$ observations of the form $z_{it}^{(j)}$ on the
appropriate values of $z_{i,t-1}^{(1)}, \ldots, z_{i,t-1}^{(k)}$ with a
multiple regression routine. Then $\hat{\Sigma}$ can be computed from
(4.2). Alternatively $\hat{B}$ and $\hat{\Sigma}$ can be obtained directly
as output of a multivariate regression routine, or of a maximum
likelihood estimation routine such as LISREL. As a partial extension of
Theorem 1 we have

Theorem 4: For the multi-wave panel study with normal errors the
estimators $\hat{B}$ and $\hat{\Sigma}$ given in (4.2) are maximum
likelihood whether the initial observations are fixed or normal.

The exact distribution of $\hat{B}$ does not follow from the normality of
the errors as it did in the two-wave model, and thus no extension of the
distribution result in Theorem 1 is possible. Theorems 2 and 3 do
extend, in part, to

Theorem 5: For the multi-wave panel study with fixed initial
observations $\hat{B}$ and $\hat{\Sigma}$ are consistent estimators (in n)
and the vectors $\sqrt{n}(\hat{b}_j - b_j)$ are, for large samples, normal
random vectors with mean $0$ and covariance matrices

cov[»^(]^j  "  ^ 


-1 


cov[»^(]^^  -  q  ^ 

If  the  initial  observations  are  normal  then  ~  have  the  same 

large-sample  distribution  with  q  ^  replaced  by  9^. 


We  turn  to  the  problem  of  testing  the  significance  of  the  effects  and 
cross-effects  in  the  multi-wave  panel. 


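We again adopt the notation of the LISREL routine. We use the
identifications of section 3 with one extension, which is as follows for
the 3-wave model: let

$$\dot{\eta} = \begin{pmatrix} z_{i1} \\ z_{i2} \end{pmatrix}, \qquad \dot{\xi} = z_{i0},$$

and let

$$\dot{B} = \begin{pmatrix} 0 & 0 \\ B & 0 \end{pmatrix}, \qquad
\dot{\Gamma} = \begin{pmatrix} B \\ 0 \end{pmatrix}, \qquad
\dot{\Psi} = \begin{pmatrix} \Sigma & 0 \\ 0 & \Sigma \end{pmatrix}.$$

The representation for a general $T + 1$ wave panel is a direct extension
of the above.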


The likelihood ratio test of the significance of the effects, or
equivalently, the test of the hypothesis $H_0^{(1)}: B = 0$, is not
produced by either a multiple regression routine or a multivariate
regression routine. It is obtained from LISREL by fitting the model above
to the data and then fitting the model again but with the constraint
$B = 0$. The test statistic, t, is the difference between the chi-square
badness-of-fit statistics for the two models. For large samples it is
approximately a chi-square random variable with $k^2 = (p + q)^2$ degrees
of freedom.


Turning to the problem of testing the significance of the cross-effects
in the multi-wave panel models, the likelihood ratio test is generated
by a direct extension of the one used for the two-wave model. Let
$\chi^2(T + 1)$ be the chi-square badness-of-fit measure for the $T + 1$
wave model and let $\chi_0^2(T + 1)$ be the same measure under the
constraint that $B_{12} = B_{21}' = 0$ or equivalently, under the
hypothesis that $B$ is block diagonal. The test statistic is the
difference

$$t_3 = \chi_0^2(T + 1) - \chi^2(T + 1)$$

which for large samples is approximately a chi-square random variable
with $2pq$ degrees of freedom.

If a multiple regression routine or a multivariate regression routine is
used then the likelihood ratio test can be approximated by using a
method similar to the method used to approximate the test statistic
$t_1$ in section 3.

A Wald-type test of the significance of the cross-effects based on the
large sample distribution of $\hat{B}$ (for the multi-wave model) can
also be used. The test statistic is the natural generalization of the
statistic $t_2$ for the two-wave model and is summarized in the appendix.

This test, like the test based on $t_2$ for the two-wave panel, can be
used with either the assumption of normal initial observations or the
assumption of normal errors relaxed.


Note that the hypothesis of no cross-effects, as we have defined it, is
not the hypothesis that the sets of variables $x_{it}$ and $y_{it}$ are
independent. The hypothesis of independence is stronger than the
hypothesis $H_0^{(2)}$. Assuming the errors are normal this stronger
hypothesis is the intersection of the hypothesis of no cross-effects and
the hypothesis $H^*: \Sigma_{12} = \Sigma_{21}' = 0$, where $\Sigma_{12}$
and $\Sigma_{21}$ are the off-diagonal blocks of $\Sigma$. Anderson (1978)
develops a test for the hypothesis $H^*$ and then a conditional test for
the hypothesis of no cross-effects given $H^*$ is true. We do not use his
test but note that it can be applied directly to the multi-wave panel
model with normal initial observations and normal errors.


Finally, for a multi-wave panel study the assumption of homogeneity of the parameter matrix $B$ across waves can be tested. Relaxing this assumption the model in (4.1) becomes

$$Z_t = Z_{t-1} B_t' + E_t, \qquad t = 1, \ldots, T$$

where the terms are as defined in Section 2 except that the parameter matrix depends on $t$.

The maximum likelihood estimator of $B_t$ is

$$\hat{B}_t = Z_t' Z_{t-1} (Z_{t-1}' Z_{t-1})^{-1}.$$
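
The wave-by-wave estimator can be illustrated with a small least-squares sketch. It assumes the model $Z_t = Z_{t-1} B_t' + E_t$ with one X variable and one Y variable (so every matrix is two columns wide); the helper names and the data are hypothetical, and the noise-free data are chosen only so that the estimate should reproduce the assumed parameter matrix.

```python
def transpose(a):
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def inverse_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("Z'Z is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


def wave_estimate(z_prev, z_curr):
    """Least-squares estimate of B_t in Z_t = Z_{t-1} B_t' + E_t (two columns)."""
    zt = transpose(z_prev)
    gram = matmul(zt, z_prev)    # Z_{t-1}' Z_{t-1}
    cross = matmul(zt, z_curr)   # Z_{t-1}' Z_t
    b_transposed = matmul(inverse_2x2(gram), cross)  # estimate of B_t'
    return transpose(b_transposed)


# Hypothetical noise-free data: four subjects, one X and one Y variable.
z0 = [[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 3.0]]
b_true = [[0.8, 0.1], [0.0, 0.6]]
z1 = matmul(z0, transpose(b_true))   # Z_1 = Z_0 B'
b_hat = wave_estimate(z0, z1)
print([[round(v, 6) for v in row] for row in b_hat])
```

Applying `wave_estimate` separately to each pair of adjacent waves gives one estimate per wave, which is what a homogeneity test compares.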

In the context of the multivariate autoregressive process, Anderson (1978) develops a test of the hypothesis of homogeneity $B_1 = \cdots = B_T$

which uses the test statistic

T  ^  ^ 

-  n  tr  y  (i  I  -1 

A  A  ' 

where $\hat{B}$ and $\hat{\Sigma}$ are defined in (4.2). If the initial observations are normal and the errors are normal then $t_5$ is, for large samples, approximately a chi-square random variable with (T-1)pq degrees of freedom. For multi-wave panel studies $t_5$ can be used to test the assumption of homogeneity.


The Sobel and Bohrnstedt proportional reduction in error measure extends immediately from the two-wave to a multi-wave model.

5.  Application  of  Methods  to  a  Panel  Study 

The upper management of a consortium of health maintenance organizations (HMO's) wants to know if there is significant correlation between patients' attitudes toward health maintenance organizations and their perceptions of the quality of care they are receiving from the HMO in which they enrolled. If so, they would like to know if over time there appears to be a "causal priority" between such attitudes and perceptions. Do the patient's attitudes toward HMO's precede or "drive" his or her perceptions of the quality of the care being received? If yes, then management might want to invest resources in a campaign to improve the attitudes of the general public toward HMO's. On the other hand, suppose the effect over time is reversed with the perceptions of the quality of care preceding or "driving" the attitude toward HMO's. Then management may want to invest those same resources in a more focused campaign to improve the patient's perceptions of the care received.

To obtain a preliminary insight into the issue, management conducted a survey of randomly selected patients enrolled in their member HMO's. For a variety of reasons, including minimizing cost and disruption, a panel design was used. Patients were interviewed upon completion of each self-initiated visit to the HMO, the visits being considered waves. To demonstrate our methods we analyze a subsample of 50 patients over three waves. This subsample was chosen randomly but for the sake of simplicity patients with incomplete data or unusual health-care status (e.g., the terminally ill) were not considered for this subsample.

Two indicators of attitudes toward HMO's are used in our analysis. The first, $X_1$, indicates the patient's attitude toward the specific HMO in which he or she is enrolled. The second, $X_2$, indicates the patient's attitude toward the concept of "socialized medicine," meaning the government providing the general public a low-cost alternative to fee-for-service health-care. These variables were thought to capture two closely related dimensions in the overall attitude toward HMO's. Both variables are rescaled, in part to mask proprietary data, to have mean 10 and a standard deviation of about 3 with a higher value indicating a more positive attitude.

Two perceptions of quality of health-care received are included. The first, $Y_1$, indicates the perception of the quality of care received in the visit just concluded and the second, $Y_2$, indicates the perception of the quality of care received since initial enrollment. The two variables, while related, were thought to capture different issues of quality of care. Again, the variables were scaled over a larger sample to have mean of 10 and standard deviation of about 3 with a higher value indicating a more positive perception.

Table  1  displays  some  correlational  statistics  for  the  raw  data.  The 
correlation  matrix  shows  that  there  is  some  (marginal)  relationship 
between  the  two  indicators  of  the  attitude  toward  HMO's  (r  =  .38)  but 
little  relationship  between  the  two  pairs  of  measurements.  The  multiple 
correlations  indicate  that  the  attitudes  and  perceptions  at  wave  t  are 
well  predicted  by  the  attitudes  and  perceptions  at  wave  t  -  1 .  The 
comparison  between  the  multiple  correlation  and  serial  correlations 
indicates  that  the  majority  of  this  predictability  can  be  attributed 
to  the  relationship  between  a  measurement  at  wave  t  -  1  and  the  same 
measurement at wave t. For each variable the serial correlation is
within .05 of the multiple correlation. Along with these summary
statistics the usual descriptive methods (histograms, box plots, stem-and-leaf plots, and others) were used to examine the shapes of the distributions of the measurements. All were quite symmetric and fairly normal with no significant outliers.

Table 2 summarizes the output of a multiple regression routine applied to each variable individually. The estimates of the regression parameters are the maximum likelihood estimates of Theorem 4.

The cross-effects seem small when compared to their estimated (marginal) standard errors but Table 2 does not provide a correct test of the significance of the effects or cross-effects. For these we turn to Table 3 which again displays the maximum likelihood estimates of the regression parameters. These estimates can be obtained from the regression output in Table 2, from a multivariate regression routine, or from a maximum likelihood routine such as LISREL.

The ratios of the estimates to their (correctly) estimated standard errors are given in Table 3. The estimates of the standard errors used as the denominators are obtained from a multivariate regression routine, from LISREL, or from the multiple regression output. These estimates are the estimates of Theorem 5 and are accurate for large samples. The ratios are approximately normal and thus give a rough indication of the size of each coefficient. These ratios indicate that attitudes at wave t - 1 are good predictors of attitudes at wave t and that perceptions at wave t - 1 are good predictors of perceptions at wave t.

The ratios of the estimates of the cross-effect parameters to their estimated standard errors are all quite small. These ratios indicate that neither attitudes nor perceptions are good predictors of the other over waves. We note their sizes but do not use these ratios to test the hypothesis of no cross-effects since the ratios are not independent.

Table 3 also gives the likelihood ratio statistic for testing the hypothesis of no effects (t = 688.69), which can be compared to the critical value of a chi-square distribution with 16 degrees of freedom and is quite significant. This result is as expected given the high degree of serial correlation indicated in Tables 1 and 2.

The likelihood ratio test statistic for the hypothesis of no cross-effects is also given ($t_3$ = 4.81) as is the Wald-type test statistic ($t_4$ = 5.41). The former was obtained from two runs of the LISREL routine and the latter was obtained by brute-force calculation using the output of a multivariate regression routine. We found the PROC routines of SAS particularly handy for both the multivariate regression and for the subsequent calculations.

Both  of  these  test  statistics  are  asymptotically  chi-square  with  8 
degrees  of  freedom  and  neither  is  significant  at  the  .05  level. 

Neither  test  gives  much  evidence  for  the  existence  of  cross-effects. 
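
The comparisons with the chi-square distribution can be reproduced with a short calculation. The sketch below uses the closed form of the chi-square upper tail that holds when the degrees of freedom are even, which happens to cover every test reported here; the function name is hypothetical, and the statistics and degrees of freedom are the ones quoted in Table 3.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable with even df.

    For df = 2k the tail has the closed form
    exp(-x/2) * sum_{j<k} (x/2)^j / j!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total


# Statistics and degrees of freedom as reported in Table 3.
reported = {
    "LR test, no effects": (688.69, 16),
    "LR test, no cross-effects": (4.81, 8),
    "Wald-type test, no cross-effects": (5.41, 8),
    "homogeneity test": (173.11, 4),
}
for name, (stat, df) in reported.items():
    p = chi2_sf_even_df(stat, df)
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"{name}: p = {p:.4g} ({verdict} at .05)")
```

For odd degrees of freedom a general routine from a statistics library would be needed instead of this closed form.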

The final hypothesis of interest is the hypothesis of homogeneity, $B_1 = B_2$. The test statistic for homogeneity ($t_5$ = 173.11) is significant at the .05 level. This test indicates that the matrix of regression parameters appears to vary over waves.

From the LISREL output the chi-square test for the goodness of fit for the model with cross-effects is $\chi^2 = 78.18$ and the value for the model without cross-effects is $\chi_0^2 = 82.96$. Thus the Sobel-Bohrnstedt type proportional reduction in error measure has value

$$\mathrm{PRE} = \frac{82.96 - 78.18}{82.96} = .06$$

which indicates that adding the cross-effects reduces the badness-of-fit of the model by about 6 percent.
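
The arithmetic behind the proportional reduction in error measure is simple enough to check directly. In the sketch below the function name is hypothetical, and the two inputs are the goodness-of-fit chi-square values quoted from the LISREL output.

```python
def proportional_reduction_in_error(chi2_without, chi2_with):
    """(chi2_without - chi2_with) / chi2_without for two nested fits."""
    if chi2_without <= 0:
        raise ValueError("the baseline chi-square must be positive")
    return (chi2_without - chi2_with) / chi2_without


# Goodness-of-fit values quoted from the LISREL output.
pre = proportional_reduction_in_error(82.96, 78.18)
print(f"PRE = {pre:.2f}")
```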


The analysis suggests that the attitudes toward HMO's and the perceptions of the quality of health-care received are not highly interactive over waves. The conclusion is tentative, in part because the evidence against the homogeneity of the regression parameters makes interpretation difficult.


6. Future Directions

We have presented statistical results for estimating the parameters and
testing the hypotheses of no effects and no cross-effects in a
cross-lagged panel study. Our results are an extension of the work of
several  social  methodologists  on  the  regression  approach  to  modeling 
panel  data.  They  are  also  an  application  of  results  on  multivariate 
linear  inference  applied  to  the  panel  models  widely  used  by  social 
scientists.  We  extend  this  earlier  work  by  considering  the  simultaneous 
nature  of  the  regression  models  formulated  and  by  considering  the  nature 
of  the  observations  made  at  the  initial  wave.  We  provide  multivariate 
estimators  of  the  regression  parameters  and  tests  of  the  hypotheses  of 
no effects and no cross-effects. We also consider the problem of summarizing the contributions of the cross-effects to the degree of fit of the model and the problem of testing the homogeneity assumption on the regression coefficients.

The continuous variable panel study has attracted attention in many disciplines including econometrics. There the focus has been on the problem of correctly formulating the error structure in the regression approach and on the problem of estimating the model under different formulations (e.g., Balestra and Nerlove (1966), Maddala (1971), and Wallace and Hussain (1969)). Judge et al. (1980) give an excellent review of this research.

Anderson and Hsiao (1981, 1982) have produced excellent work on some of the statistical issues that arise from a particular econometric specification of the error structure in a univariate panel model. They, and most econometricians, study a regression model for a single response variable in which the error for respondent i at time t can be decomposed into the sum of two (or more) terms, the first of which depends on i and t and the second of which depends only on i. The second term allows the model to capture the fact that the respondent may tend to be above the predicted value from the regression model at every wave. Various sets of assumptions can be made on the behavior of these two terms over time.
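
In symbols, this decomposition can be sketched as follows. All symbols are introduced here for illustration only ($u_{it}$ for the error of respondent $i$ at time $t$, $\varepsilon_{it}$ and $\mu_i$ for its two components), and the display shows just one common set of assumptions: components that are mutually uncorrelated, with $\varepsilon_{it}$ serially uncorrelated and both variances constant.

```latex
\begin{aligned}
u_{it} &= \varepsilon_{it} + \mu_i, \qquad
\operatorname{Var}(\varepsilon_{it}) = \sigma_\varepsilon^2, \quad
\operatorname{Var}(\mu_i) = \sigma_\mu^2, \\
\operatorname{Var}(u_{it}) &= \sigma_\varepsilon^2 + \sigma_\mu^2, \qquad
\operatorname{Cov}(u_{it}, u_{is}) = \sigma_\mu^2 \quad (t \neq s).
\end{aligned}
```

Under these assumptions the errors of one respondent are positively correlated across waves whenever $\sigma_\mu^2 > 0$, which is how the respondent-specific term captures a persistent deviation from the predicted value.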

For several panel studies we have adopted this econometric error formulation and it seems to be quite realistic. We are studying the problem of testing for the presence of cross-effects with this type of error structure (Mayer (1985b)).

We are also studying the problem of detection of autoregressive errors in the multivariate panel model as specified in this paper and the problem of testing for the presence of cross-effects given that the errors are autoregressive (Mayer (1985a)).

Finally, we are looking at the problem of estimating the parameter matrix and covariance matrix and testing for the presence of cross-effects in a panel study where the homogeneity assumption on the matrix of regression parameters does not hold.




Table 1

Correlation Statistics for the Raw Data: HMO Data

Correlation Matrix for Raw Data

          X1      X2      Y1      Y2
X1      1.00     .38     .20     .20
X2              1.00     .12     .05
Y1                      1.00     .38
Y2                              1.00

Multiple Correlations and Serial Correlations

Variable    Multiple Correlation    Serial Correlation
X1                  .96                    .95
X2                  .91                    .87
Y1                  .93                    .93
Y2                  .75                    .70


Table 2

Multiple Regression Summary: HMO Data

Dependent Variable: X1 (attitude toward HMO)

Source     df    Sum of Squares    Mean Square        F     Prob Value
Model       4        1041.27          260.32       321.5     < .0001
Error      96          77.73             .81
Total     100        1119.00

Predictor          Parameter    Standard
Variable     df    Estimate     Error          t      Prob Value
X1            1       .79         .03        31.33     < .0001
X2            1      -.04         .04        -1.06       .29
Y1            1      -.03         .03        -1.05       .30
Y2            1      -.03         .04         -.56       .57

Dependent Variable: X2 (attitude toward socialized medicine)

Source     df    Sum of Squares    Mean Square        F     Prob Value
Model       4         513.73          128.43      116.12     < .0001
Error      96         106.18            1.11
Total     100         619.90

Predictor          Parameter    Standard
Variable     df    Estimate     Error          t      Prob Value
X1            1      -.06         .03        -1.97       .05
X2            1       .87         .04        19.99     < .0001
Y1            1       .00         .04          .12       .88
Y2            1      -.02         .05         -.34       .50

Dependent Variable: Y1 (perception of quality of current care)

Source     df    Sum of Squares    Mean Square        F     Prob Value
Model       4         443.85          110.96      137.01     < .0001
Error      96          77.75             .81
Total     100         521.59

Predictor          Parameter    Standard
Variable     df    Estimate     Error          t      Prob Value
X1            1       .03         .03         1.19       .24
X2            1       .01         .04          .15       .88
Y1            1       .65         .03        20.59     < .0001
Y2            1       .03         .04          .67       .50

Dependent Variable: Y2 (perception of overall care)

Source     df    Sum of Squares    Mean Square        F     Prob Value
Model       4         274.41           68.60       30.70     < .0001
Error      96         214.54            2.23
Total     100         488.96

Predictor          Parameter    Standard
Variable     df    Estimate     Error          t      Prob Value
X1            1       .01         .04          .19       .85
X2            1       .01         .06          .11       .91
Y1            1       .01         .05          .13       .89
Y2            1       .72         .07         9.71     < .0001


Table 3

Inferences for Panel Model: HMO Data

Estimates of Regression Parameters
(rows: dependent variable at wave t; columns: predictor at wave t - 1)

          X1      X2      Y1      Y2
X1       .79    -.04    -.03    -.03
X2      -.06     .87     .00    -.02
Y1       .03     .00     .65     .03
Y2       .01     .01     .01     .72

Ratio of Estimates to Estimated Standard Errors

          X1      X2      Y1      Y2
X1     31.32   -1.06   -1.05    -.56
X2     -1.97   20.00     .12    -.34
Y1      1.20     .15   20.59     .67
Y2       .19     .11     .13    9.71

Estimate  of  Covariance  Matrices 


-.14 

.20 

21.47 

6.73 

4.13 

4.20 

.03 

,28 

0  : 

8.393 

1.81 

.82 

.53 

.06 

13.72 

3.80 

2.59 

5.95 

Likelihood Ratio Test of Presence of Effects

t = 688.69, approximately chi-square, df = 16

Likelihood Ratio Test of Presence of Cross-Effects

$t_3$ = 4.81, approximately chi-square, df = 8

Wald-Type Test of Presence of Cross-Effects

$t_4$ = 5.41, approximately chi-square, df = 8

Anderson Test of Homogeneity ($H_0$: $B_1 = B_2$)

$t_5$ = 173.11, approximately chi-square, df = 4

Sobel-Bohrnstedt Proportional Reduction in Error Measure

PRE = .06


A. Appendix

This appendix summarizes the technical foundations and implications of the results presented. It begins with consideration of the two-wave model.

A.1 Two-Wave Model

The first result (Theorem 1) follows directly from the theory of multivariate linear models because the two-wave model with fixed initial wave can be treated as a multivariate regression model. This theory also implies that the distribution of the estimated regression parameters is normal for samples of all sizes and thus the normal is not used as a large-sample approximation.

The second result (Theorem 2) also follows from the theory of multivariate linear models since it is a combined application of the multivariate extension of the Gauss-Markov Theorem and the multivariate central limit theorem (Anderson (1984), Rao (1973)).

The third result (Theorem 3) does not follow from the theory of multivariate linear models since the proposition stated is not conditional on the initial wave. The part of the result for the model with normal errors does follow, however, from the theory of replicated vector-valued autoregressive processes developed by Anderson (1978) or by direct calculation. The log of the likelihood function for the model with normal initial observations and normal errors is

i  .  c  -  i  log  |0^|  -  i  log  U|  -  f  ^ 

which is maximized by the estimators defined in (3.1).

The remainder of Theorem 3, the large sample distribution for the vector $b$ of estimated regression parameters for the model without normal errors, follows from applying the multivariate central limit theorem and direct calculation of the covariance matrices.

The assumption of a stationary covariance matrix is more attractive in time-series analysis than in panel analysis. For a simple time series of length T + 1 the assumption that $\Theta_0$ is of a particular form is an assumption on a single observation, $z_0$. In a T + 1 wave panel model the assumption is on all the n observations at wave 0. Should the data not be consistent with the assumption, the covariance matrix $\Theta_0$ may vary significantly from $\Theta_t$ (t = 1, ..., T). For this case, the assumption of stationarity imposes significantly on the estimates of the matrices $B$ and $\Theta_t$.

In our notation the standard hypothesis of the multivariate regression model is

$$H_0(A, C): A B C = 0$$

where $B$ is the matrix of regression parameters and $A$ and $C$ are contrast matrices which "pick out" rows and columns (Anderson (1984), Dunteman (1984)). There are no matrices $A$, $C$ which will give a null hypothesis $H_0(A, C)$ identical to $H_0^{(2)}: B_{12} = B_{21}' = 0$.


The theory of likelihood ratio methods is a central theme in statistical theory. This theory yields the large sample distribution of the test statistic and certain optimality properties for the test (e.g., Cox and Hinkley (1974)). Most critical for the model at hand, the likelihood ratio test is asymptotically locally most powerful under relatively weak conditions, conditions that obtain for the cross-lagged panel model (e.g., Cox and Hinkley (1974)).

The likelihood ratio test statistic for the significance of the cross-effects in the two-wave model ($t_1$) can be approximated by application of Zellner's seemingly unrelated regression. For the simple case (p = q = 1) note that when $H_0^{(2)}$ is true the model can be expressed

$$\begin{pmatrix} x_1 \\ y_1 \end{pmatrix} = \begin{pmatrix} x_0 & 0 \\ 0 & y_0 \end{pmatrix} \begin{pmatrix} \alpha \\ \varepsilon \end{pmatrix} + \begin{pmatrix} e_{x1} \\ e_{y1} \end{pmatrix} \qquad \text{(A.1)}$$

which can be relabeled as

$$u_1 = U_0 \theta + v_1 \qquad \text{(A.2)}$$

where $u_1$ and $v_1$ are 2n-vectors, $U_0$ is a 2n x 2 matrix, $\theta = (\alpha, \varepsilon)'$ is a 2-vector, and $v_1$ has covariance matrix $\Sigma \otimes I$ where $\Sigma$ is a 2 x 2 matrix.

The maximum likelihood estimators for the model in (A.2) can be approximated as follows [Zellner (1962); Schmidt (1976); Malinvaud (1980)]. If $\Sigma$ were known then the maximum likelihood estimator of $\theta$ would be the weighted least-squares estimator

$$\hat{\theta} = [U_0' (\Sigma \otimes I)^{-1} U_0]^{-1} U_0' (\Sigma \otimes I)^{-1} u_1 = (\hat{\alpha}, \hat{\varepsilon})' \qquad \text{(A.3)}$$

and thus the maximum likelihood estimator of $B$ would be $\hat{B} = \mathrm{diag}(\hat{\alpha}, \hat{\varepsilon})$.

Since $\Sigma$ is unknown $\hat{\theta}$ is approximated by substituting a consistent estimator for $\Sigma$ in (A.3). To this end let

$$\tilde{\theta} = (U_0' U_0)^{-1} U_0' u_1 = (\tilde{\alpha}, \tilde{\varepsilon})'$$

be the least squares estimator of $\theta$ and let

$$\tilde{B} = \mathrm{diag}(\tilde{\alpha}, \tilde{\varepsilon})$$

be the restricted least squares estimator of $B$. Estimate $\Sigma$ by $\tilde{\Sigma}$ and then approximate the maximum likelihood estimator $\hat{\theta}$ by $\theta^* = (\alpha^*, \varepsilon^*)'$,

$$\theta^* = [U_0' (\tilde{\Sigma} \otimes I)^{-1} U_0]^{-1} U_0' (\tilde{\Sigma} \otimes I)^{-1} u_1$$

where

$$\tilde{\Sigma} = \frac{1}{n} (Z_1 - Z_0 \tilde{B}')' (Z_1 - Z_0 \tilde{B}')$$

and, in turn, approximate the maximum likelihood estimator $\hat{B}$ by

$$B^* = \mathrm{diag}(\alpha^*, \varepsilon^*) = \begin{pmatrix} \alpha^* & 0 \\ 0 & \varepsilon^* \end{pmatrix}. \qquad \text{(A.4)}$$

For an estimator of the covariance matrix $\Sigma$ we use

$$\Sigma^* = \frac{1}{n} (Z_1 - Z_0 B^{*\prime})' (Z_1 - Z_0 B^{*\prime}).$$

The theory of seemingly unrelated regression [Zellner (1962), Schmidt (1976)] yields that for a two-wave panel study with fixed initial observations and normal errors, if the hypothesis of no cross-effects holds, the estimator $B^*$ is asymptotically equivalent to the maximum likelihood estimator $\hat{B}$ and the associated estimator $\Sigma^*$ is a consistent estimator of the covariance matrix $\Sigma$.
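
The two-step scheme for the case p = q = 1 can be sketched in a few lines. This is an illustration under stated assumptions, not the routine used in the paper: the function name, the symbols alpha and epsilon for the two own-effects, the divisor n in the residual covariance matrix, and the simulated data (including the parameter values .8 and .6) are all chosen here for the example.

```python
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sur_no_cross_effects(x0, y0, x1, y1):
    """Two-step estimate of (alpha, epsilon) in
    x1 = alpha * x0 + e1 and y1 = epsilon * y0 + e2."""
    n = len(x0)
    # Step 1: equation-by-equation least squares.
    a_ls = dot(x0, x1) / dot(x0, x0)
    e_ls = dot(y0, y1) / dot(y0, y0)
    r1 = [x1[i] - a_ls * x0[i] for i in range(n)]
    r2 = [y1[i] - e_ls * y0[i] for i in range(n)]
    # Step 2: residual covariance matrix (divisor n is an assumption).
    s11, s12, s22 = dot(r1, r1) / n, dot(r1, r2) / n, dot(r2, r2) / n
    det = s11 * s22 - s12 * s12
    w11, w12, w22 = s22 / det, -s12 / det, s11 / det  # inverse of Sigma
    # Step 3: weighted least squares with weight (Sigma^{-1} kron I).
    g11, g12, g22 = w11 * dot(x0, x0), w12 * dot(x0, y0), w22 * dot(y0, y0)
    h1 = w11 * dot(x0, x1) + w12 * dot(x0, y1)
    h2 = w12 * dot(y0, x1) + w22 * dot(y0, y1)
    gdet = g11 * g22 - g12 * g12
    a_star = (g22 * h1 - g12 * h2) / gdet
    e_star = (g11 * h2 - g12 * h1) / gdet
    return (a_ls, e_ls), (a_star, e_star)


# Simulated data with correlated errors and no cross-effects.
random.seed(0)
n = 500
x0 = [random.gauss(0, 3) for _ in range(n)]
y0 = [random.gauss(0, 3) for _ in range(n)]
common = [random.gauss(0, 0.7) for _ in range(n)]
x1 = [0.8 * x0[i] + common[i] + random.gauss(0, 0.5) for i in range(n)]
y1 = [0.6 * y0[i] + common[i] + random.gauss(0, 0.5) for i in range(n)]
ls, star = sur_no_cross_effects(x0, y0, x1, y1)
print("least squares:", ls)
print("two-step:", star)
```

Both pairs of estimates should land near the simulated values; the two-step estimates differ from least squares only through the estimated correlation between the two error terms.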


The likelihood ratio criterion for testing the hypothesis of no cross-effects is approximated by $t_1^* = -2 \log L^*$ where

$$L^* = \frac{|\hat{\Sigma}|^{n/2} \exp\{-\tfrac{1}{2} \mathrm{tr}\, \Sigma^{*-1} (Z_1 - Z_0 B^{*\prime})' (Z_1 - Z_0 B^{*\prime})\}}{|\Sigma^*|^{n/2} \exp\{-\tfrac{1}{2} \mathrm{tr}\, \hat{\Sigma}^{-1} (Z_1 - Z_0 \hat{B}')' (Z_1 - Z_0 \hat{B}')\}}.$$

For a two-wave panel study with fixed initial observations and normal errors, if p = q = 1 and the hypothesis of no cross-effects holds then the statistic $t_1^*$ is asymptotically chi-square with two degrees of freedom and the test based on $t_1^*$ is asymptotically equivalent to the likelihood ratio test (based on $t_1$) and is thus an asymptotically locally most powerful test.

The proof follows from the consistency of the estimators $\tilde{\Sigma}$ and $\Sigma^*$ and from the theory of likelihood ratio tests and asymptotically equivalent tests (Cox and Hinkley (1974)).

For the more general two-wave panel study (p and q not restricted) the model is replaced by a model of seemingly unrelated multivariate regressions. The construction above extends to this more general case with the test statistic $t_1^* = -2 \log L^*$ being asymptotically chi-square with 2pq degrees of freedom if $H_0^{(2)}$ is true.

A.2 The Multi-wave Model

The first result for the multi-wave panel model (Theorem 4) is proved by combining the major result on estimating the parameters in a multivariate autoregressive process [Anderson (1978)] or a result in econometrics on estimating the parameters in a recursive linear system (Malinvaud (1980)) with the multivariate central limit theorem (Anderson (1984)).

For the multi-wave model the hypothesis of no cross-effects is the same as for the two-wave model, $H_0^{(2)}: B_{12} = B_{21}' = 0$. Under this hypothesis the multi-wave panel model can be expressed as

$$Z_t = Z_{t-1} \begin{pmatrix} B_{11} & 0 \\ 0 & B_{22} \end{pmatrix}' + E_t \qquad (t = 1, \ldots, T).$$

Since the regressor variables are either fixed or lagged endogenous, the maximum likelihood estimator of $B$ can be approximated by extending the scheme based on Zellner's seemingly unrelated regression model. This scheme yields an estimator

$$B^* = \begin{pmatrix} B_{11}^* & 0 \\ 0 & B_{22}^* \end{pmatrix}$$

which is an extension of the estimator displayed in (A.4) and which is asymptotically equivalent to the restricted maximum likelihood estimator of $B$.

Using notation familiar in the analysis of multivariate linear models (cf. Anderson (1984), Rao (1973)) the error "sum of squares" matrix is defined by

^  ^  n  (T-1)  E  (^  -  (^  -  ’ 

t=l 

and  the  hypothesis  "sum  of  squares"  matrix  is  defined  by 

I  =  nT  (^*  "  V 

The test statistic $t^*$ is defined by

The theory of likelihood ratio statistics yields that for a multi-wave panel study with fixed initial observations and normal errors, if the hypothesis of no cross-effects holds then the test statistic $t^*$ is asymptotically chi-square with 2pq degrees of freedom. Furthermore the test based on $t^*$ is asymptotically equivalent to the likelihood ratio test based on $t_3$.

The  final  test  of  the  hypothesis  of  no  cross-effects  is  a  Wald-type  test 
and  follows  from  the  asymptotic  distribution  of  the  maximum  likelihood 
estimator  given  in  Theorem  4. 

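Let $\tau$ be the 2pq vector of all elements in $B_{12}$ and $B_{21}$ and let $\hat{\tau}$ be the 2pq vector of the elements in $\hat{B}_{12}$ and $\hat{B}_{21}$ arranged conformally with $\tau$. Let $\Theta_\tau$ be the submatrix of $\Theta$ that is the covariance matrix of $\hat{\tau}$. Let $\hat{\Theta}_\tau$ be the corresponding submatrix of $\hat{\Theta}$;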

then  let 


For a multi-wave panel study, if the initial observations are fixed and the errors are normal and if the hypothesis of no cross-effects holds, $t_4$ is asymptotically chi-square with 2pq degrees of freedom and the test based on $t_4$ is asymptotically equivalent to the likelihood ratio test.

Abstract

Cross-lagged panel studies are statistical studies in which two or more variables are measured for a large number of subjects at each of several waves or points in time. The variables divide naturally into two sets and the primary purpose of the analysis is to estimate and test the cross-effects between the two sets. Such studies are found in the mainstreams of social, behavioral and business research. One approach to this analysis is to express the cross-effects as parameters in regression equations and then use regression methods to estimate and test the parameters. We contribute to this approach by extending the regression model to a multivariate model that captures the correlation between the dependent variables. We develop estimators for the parameters of this model and hypothesis tests for assessing the presence of effects and cross-effects. We demonstrate our results with the analysis of a cross-lagged panel study of the perceptions and attitudes of patients toward a health maintenance organization.